Fully Delexicalized Contexts for Syntax-Based Word Embeddings

Authors

  • Jenna Kanerva
  • Sampo Pyysalo
  • Filip Ginter
Abstract

● We propose fully delexicalized contexts derived from syntactic trees to train word embeddings
● We demonstrate and evaluate our embeddings against vanilla word2vec on:
  ○ Nearest neighbours
  ○ Correlation with human judgement
  ○ Dependency parsing
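The core idea of a fully delexicalized context can be sketched as follows: each token's training contexts are built only from part-of-speech tags and dependency relations, never from word forms. Below is a minimal illustration on a toy dependency parse; the exact feature naming (`head:POS/rel`, `dep:POS/rel`) is a hypothetical scheme for demonstration, not necessarily the authors' exact feature set.

```python
# Sketch of fully delexicalized context extraction from a dependency tree.
# Each token is (form, POS tag, head index or -1 for root, dependency relation).
# The extracted contexts use only POS and deprel labels -- no word forms.

def delexicalized_contexts(tokens):
    """Return {form: [context strings]} built purely from tags and relations."""
    contexts = {}
    for i, (form, pos, head, deprel) in enumerate(tokens):
        feats = []
        # Context from the governing word: its POS plus this token's relation.
        if head >= 0:
            head_pos = tokens[head][1]
            feats.append(f"head:{head_pos}/{deprel}")
        # Context from each dependent: its POS plus its relation to this token.
        for _, dpos, dhead, ddeprel in tokens:
            if dhead == i:
                feats.append(f"dep:{dpos}/{ddeprel}")
        contexts[form] = feats
    return contexts

# Toy parse of "scientists discover planets"; "discover" is the root.
sent = [
    ("scientists", "NOUN", 1, "nsubj"),
    ("discover",   "VERB", -1, "root"),
    ("planets",    "NOUN", 1, "obj"),
]
ctx = delexicalized_contexts(sent)
print(ctx["scientists"])  # ['head:VERB/nsubj']
print(ctx["discover"])    # ['dep:NOUN/nsubj', 'dep:NOUN/obj']
```

Such (word, delexicalized-context) pairs can then be fed to any skip-gram-style trainer in place of linear bag-of-words contexts.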


Related Resources

Delexicalized Word Embeddings for Cross-lingual Dependency Parsing

This paper presents a new approach to the problem of cross-lingual dependency parsing, aiming at leveraging training data from different source languages to learn a parser in a target language. Specifically, this approach first constructs word vector representations that exploit structural (i.e., dependency-based) contexts but only considering the morpho-syntactic information associated with ea...


TurkuNLP: Delexicalized Pre-training of Word Embeddings for Dependency Parsing

We present the TurkuNLP entry in the CoNLL 2017 Shared Task on Multilingual Parsing from Raw Text to Universal Dependencies. The system is based on the UDPipe parser, with our focus on exploring various techniques to pre-train the word embeddings used by the parser in order to improve its performance, especially on languages with small training sets. The system ranked 11th among the 33 part...


Dual Embeddings and Metrics for Relational Similarity

In this work, we study the problem of relational similarity by combining different word embeddings learned from different types of contexts. The word2vec model with linear bag-of-words contexts can capture more topical and less functional similarity, while the dependency-based word embeddings with syntactic contexts can capture more functional and less topical similarity. We explore to...
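One simple way to combine two embedding spaces, as the dual-embedding idea above suggests, is to unit-normalise each vector and concatenate the parts, then compare words by cosine similarity in the joint space. The sketch below uses toy vectors, not learned embeddings, and the `combine` helper is an illustrative assumption rather than the paper's specific method.

```python
import numpy as np

def combine(vec_topical, vec_functional):
    """Concatenate two embeddings after L2-normalising each part, so both
    context types contribute equally to similarities in the joint space."""
    a = vec_topical / np.linalg.norm(vec_topical)
    b = vec_functional / np.linalg.norm(vec_functional)
    return np.concatenate([a, b])

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy example: two words that agree in both spaces up to scale.
x = combine(np.array([1.0, 0.0]), np.array([0.0, 2.0]))
y = combine(np.array([1.0, 0.0]), np.array([0.0, 5.0]))
print(round(cosine(x, y), 3))  # 1.0
```

Normalising before concatenation matters: without it, the space with larger vector norms would dominate the combined similarity.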


Dependency-Based Word Embeddings

While continuous word embeddings are gaining popularity, current models are based solely on linear contexts. In this work, we generalize the skip-gram model with negative sampling introduced by Mikolov et al. to include arbitrary contexts. In particular, we perform experiments with dependency-based contexts, and show that they produce markedly different embeddings. The dependency-based embedding...
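Dependency-based contexts in this style pair each word with its syntactic neighbours rather than its linear neighbours: for every edge head --rel--> modifier, the head receives the context "modifier/rel" and the modifier receives the inverse context "head/rel-1". A minimal sketch, assuming a toy edge list (the string format is one common convention, not a fixed standard):

```python
# Sketch of dependency-based context extraction: each edge yields one
# context for the head and one inverse context for the modifier.

def dependency_contexts(edges):
    """edges: list of (head_word, relation, modifier_word) tuples.
    Returns (word, context) training pairs for a skip-gram-style model."""
    pairs = []
    for head, rel, mod in edges:
        pairs.append((head, f"{mod}/{rel}"))     # head's context
        pairs.append((mod, f"{head}/{rel}-1"))   # modifier's inverse context
    return pairs

# Toy parse of "scientists discover planets".
edges = [("discover", "nsubj", "scientists"), ("discover", "obj", "planets")]
for word, ctx in dependency_contexts(edges):
    print(word, ctx)
# discover scientists/nsubj
# scientists discover/nsubj-1
# discover planets/obj
# planets discover/obj-1
```

Note the contrast with the fully delexicalized contexts proposed in the paper above: here the context strings still contain word forms, which is what delexicalization removes.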


Transferring Coreference Resolvers with Posterior Regularization

We propose a cross-lingual framework for learning coreference resolvers for resource-poor target languages, given a resolver in a source language. Our method uses word-aligned bitext to project information from the source to the target. To handle task-specific costs, we propose a softmax-margin variant of posterior regularization, and we use it to achieve robustness to projection errors. We sho...




Publication date: 2017